Visualizing Bipartite Network Data using SAS® Visualization Tools

Author

  • John Zheng
Abstract

Bipartite networks are networks created from affiliation relationships, such as patient-provider relationships in healthcare data. Analyzing such networks yields additional insight into groups that share the same member and members that belong to the same group. This paper develops methods to visualize bipartite network data in healthcare using the Annotate facility. Link analysis is used to explore the connections between healthcare providers and patients, and hence the connections between peer providers. This paper also takes advantage of the spatial information in the patient-provider data to gain additional insight. In particular, the connections between providers are projected on a Google map using Google Map Generator.

INTRODUCTION

Prior work has been published on using SAS to visualize and to compute network statistics of classical social networks (Hoyle 2006, Hornibrook 2009, Ellis 2009). This paper focuses on the visualization of bipartite networks, which are networks that consist of two disjoint sets of nodes and in which every edge must connect a node in one set to a node in the other. Bipartite networks are often used to model affiliation relationships, such as the patient-provider network or the consumer-seller network. In this paper, special attention is paid to uncovering the hidden connections in patient-provider bipartite networks. Instead of directly applying social network analysis and visualization techniques to the original bipartite network, we first convert it into a classical (unipartite) network of providers that is defined by patient sharing. Once the network of providers is established, the unnaturally dense cliques within it are of particular interest. The clustered providers are displayed distinctively, and suspicious patterns are flagged for further study.

Large networks pose one of the greatest challenges in visualizing social networks.
In the patient-provider network, for example, there are over 13,000 health care providers registered under a National Provider Identifier (NPI) in the tri-state area of Maryland, Virginia and DC, and 230,000+ FFS Medicare Part B patients in calendar year 2009. With a data set this large, the first visualization challenge is laying out the nodes. Multidimensional scaling (MDS) is often used to produce a spatial layout of the nodes based on their "similarity." However, MDS does not scale well with an increasing number of nodes. Two approaches have been used to overcome the difficulty of laying out a very large network. The first (Hoyle 2009) uses a variation of MDS to generate coordinates for the node data set. The second applies when spatial information (such as an address or zip code) is available: one can skip MDS entirely by projecting all nodes on a Google map using the Google Map Generator provided by SAS/GRAPH. This paper discusses these two methods as well as a third method, component analysis, which dissects the original network into many smaller, manageable sub-networks.

THE REPRESENTATION OF SOCIAL NETWORK DATA IN SAS

Social network data can be represented in one of two formats in SAS: the matrix format and the edge-list format. The matrix format stores social network data in an n-by-n square matrix with all nodes listed across the top and down the side. The value of cell(i,j) in the matrix indicates whether node i and node j are connected (0 – no connection;

Posters NESUG 2010
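The ideas described above (projecting the bipartite network onto providers by patient sharing, the edge-list and matrix formats, and component analysis) can be illustrated with a small sketch. The paper itself works in SAS; the Python below is only an illustration under stated assumptions and is not the author's code. The patient and provider identifiers, the function names, and the use of 1 to mark a connection in the matrix are all hypothetical (the excerpt states only that 0 means no connection).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical bipartite edge list: one (patient, provider) pair per affiliation.
patient_provider_edges = [
    ("pat1", "provA"), ("pat1", "provB"),
    ("pat2", "provB"), ("pat2", "provC"),
    ("pat3", "provA"), ("pat3", "provB"),
    ("pat4", "provD"),
]


def project_onto_providers(edges):
    """Build a provider-provider edge list weighted by shared patients."""
    providers_by_patient = defaultdict(set)
    for patient, provider in edges:
        providers_by_patient[patient].add(provider)

    shared = defaultdict(int)
    for providers in providers_by_patient.values():
        # Every pair of providers seen by the same patient shares that patient.
        for a, b in combinations(sorted(providers), 2):
            shared[(a, b)] += 1
    return dict(shared)


def to_matrix(nodes, edges):
    """Convert an edge list to an n-by-n matrix (assumed: 1 = connected, 0 = not)."""
    index = {node: k for k, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = 1  # undirected network, so symmetric
    return matrix


def connected_components(nodes, edges):
    """Split the network into sub-networks with a depth-first search."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in neighbors[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


providers = sorted({provider for _, provider in patient_provider_edges})
provider_edges = project_onto_providers(patient_provider_edges)
adjacency = to_matrix(providers, provider_edges)
components = connected_components(providers, provider_edges)
```

In this toy data, provA and provB share two patients and provB and provC share one, so those three providers form one component, while provD, who shares no patients, forms a component of its own. The edge list grows with the number of connections rather than with the square of the number of nodes, which is one reason it may be the more practical format for networks of the size discussed here.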

Similar resources

Designing an Ontology for Knowledge Discovery in Iran’s Vaccine

Ontology is a requirement engineering product and the key to knowledge discovery. It includes the terminology to describe a set of facts, assumptions, and relations with which the detailed meanings of vocabularies among communities can be determined. This is a qualitative content analysis research. This study has made use of ontology for the first time to discover the knowledge of vaccine in Ir...

Optimizing a Radial Layout of Bipartite Graphs for a Tool Visualizing Security Alerts

Effective tools are crucial for visualizing large quantities of information. While developing these tools, numerous graph drawing problems emerge. We present solutions for reducing clutter in a radial visualization of a bipartite graph representing the alerts generated by an IDS protecting a computer network. Our solutions rely essentially on (i) unambiguous edge bundling to reduce the number o...

Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...

Accommodating IPv6 Addresses in Security Visualization Tools

Visualization is used by security analysts to help detect patterns and trends in large volumes of network traffic data. With IPv6 slowly being deployed around the world, network intruders are beginning to adapt their tools and techniques to work over IPv6 (vs. IPv4). Many tools for visualizing network activity, while useful for detecting large scale attacks and network behavior anomalies still ...

Visualizing Real-Time Network Resource Usage

We present NetGrok, a tool for visualizing computer network usage in real-time. NetGrok combines well-known information visualization techniques—overview, zoom & filter, details on demand—with network graph and treemap visualizations. NetGrok integrates these tools with a shared data store that can read PCAP-formatted network traces, capture traces from a live interface, and filter the data set...

Publication date: 2010